Overview

Dataset statistics

Number of variables: 13
Number of observations: 6717
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 682.3 KiB
Average record size in memory: 104.0 B

Variable types

Numeric: 8
Categorical: 5

Alerts

name has a high cardinality: 1982 distinct values (High cardinality)
year is highly correlated with selling_price and 1 other field (High correlation)
selling_price is highly correlated with year and 1 other field (High correlation)
km_driven is highly correlated with year (High correlation)
seats is highly correlated with engine CC (High correlation)
engine CC is highly correlated with seats and 1 other field (High correlation)
power BHP is highly correlated with selling_price and 1 other field (High correlation)
selling_price is highly correlated with power BHP (High correlation)
seats is highly correlated with engine CC (High correlation)
mileage KMPL is highly correlated with engine CC (High correlation)
engine CC is highly correlated with seats and 2 other fields (High correlation)
power BHP is highly correlated with selling_price and 1 other field (High correlation)
year is highly correlated with selling_price (High correlation)
selling_price is highly correlated with year (High correlation)
engine CC is highly correlated with power BHP (High correlation)
power BHP is highly correlated with engine CC (High correlation)
year is highly correlated with owner (High correlation)
selling_price is highly correlated with owner and 1 other field (High correlation)
fuel is highly correlated with engine CC (High correlation)
transmission is highly correlated with power BHP (High correlation)
owner is highly correlated with year and 1 other field (High correlation)
seats is highly correlated with mileage KMPL and 1 other field (High correlation)
mileage KMPL is highly correlated with seats and 1 other field (High correlation)
engine CC is highly correlated with fuel and 3 other fields (High correlation)
power BHP is highly correlated with selling_price and 2 other fields (High correlation)
df_index has unique values (Unique)

Reproduction

Analysis started: 2022-04-29 02:11:15.641669
Analysis finished: 2022-04-29 02:11:24.449884
Duration: 8.81 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 6717
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3957.142177
Minimum: 0
Maximum: 8125
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 52.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 350.8
Q1: 1915
median: 3869
Q3: 5992
95-th percentile: 7669.2
Maximum: 8125
Range: 8125
Interquartile range (IQR): 4077

Descriptive statistics

Standard deviation: 2361.800637
Coefficient of variation (CV): 0.596845029
Kurtosis: -1.222934817
Mean: 3957.142177
Median Absolute Deviation (MAD): 2038
Skewness: 0.05171443885
Sum: 26580124
Variance: 5578102.25
Monotonicity: Strictly increasing
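
Several of the figures reported for df_index are derived from the others. The following minimal sketch recomputes the range, IQR, coefficient of variation, and variance from the reported values. The variable names are illustrative and are not part of the report, and the inputs are the rounded figures shown here, so results agree only up to rounding.

```python
# Values copied from the df_index statistics in this report.
minimum, maximum = 0, 8125
q1, q3 = 1915, 5992
mean = 3957.142177
std_dev = 2361.800637

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std_dev / mean               # Coefficient of variation (CV)
variance = std_dev ** 2           # Variance is the squared standard deviation

print(value_range, iqr, round(cv, 6), round(variance, 2))
```

The same relations can be used to sanity-check the statistics of the other numeric variables.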
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1 | < 0.1%
5381 | 1 | < 0.1%
5353 | 1 | < 0.1%
5352 | 1 | < 0.1%
5350 | 1 | < 0.1%
5349 | 1 | < 0.1%
5348 | 1 | < 0.1%
5347 | 1 | < 0.1%
5346 | 1 | < 0.1%
5345 | 1 | < 0.1%
Other values (6707) | 6707 | 99.9%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
8125 | 1 | < 0.1%
8124 | 1 | < 0.1%
8123 | 1 | < 0.1%
8122 | 1 | < 0.1%
8121 | 1 | < 0.1%
8120 | 1 | < 0.1%
8119 | 1 | < 0.1%
8118 | 1 | < 0.1%
8116 | 1 | < 0.1%
8115 | 1 | < 0.1%

name
Categorical

HIGH CARDINALITY

Distinct: 1982
Distinct (%): 29.5%
Missing: 0
Missing (%): 0.0%
Memory size: 52.6 KiB

Value | Count
Maruti Swift Dzire VDI | 118
Maruti Alto 800 LXI | 76
Maruti Alto LXi | 69
Maruti Swift VDI | 60
Maruti Alto K10 VXI | 47
Other values (1977) | 6347

Length

Max length: 54
Median length: 24
Mean length: 25.19994045
Min length: 11

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 898
Unique (%): 13.4%
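
The report lists both a Distinct count and a Unique count for this variable. A minimal sketch of the difference follows, assuming (as the reported percentages suggest) that Distinct counts different values while Unique counts values that occur exactly once. The list below is hypothetical toy data, not the real column.

```python
from collections import Counter

# Hypothetical toy values, not the actual contents of the name column.
names = [
    "Maruti Swift Dzire VDI",
    "Maruti Swift Dzire VDI",
    "Maruti Alto 800 LXI",
    "Skoda Rapid 1.5 TDI Ambition",
]

counts = Counter(names)
n_distinct = len(counts)                               # number of different values
n_unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

print(n_distinct, n_unique)  # 3 2
```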

Sample

1st row: Maruti Swift Dzire VDI
2nd row: Skoda Rapid 1.5 TDI Ambition
3rd row: Honda City 2017-2020 EXi
4th row: Hyundai i20 Sportz Diesel
5th row: Maruti Swift VXI BSIII

Common Values

Value | Count | Frequency (%)
Maruti Swift Dzire VDI | 118 | 1.8%
Maruti Alto 800 LXI | 76 | 1.1%
Maruti Alto LXi | 69 | 1.0%
Maruti Swift VDI | 60 | 0.9%
Maruti Alto K10 VXI | 47 | 0.7%
Hyundai EON Era Plus | 44 | 0.7%
Maruti Wagon R VXI BS IV | 43 | 0.6%
Maruti Alto LX | 43 | 0.6%
Maruti Ertiga VDI | 42 | 0.6%
Maruti Ritz VDi | 40 | 0.6%
Other values (1972) | 6135 | 91.3%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
maruti | 2089 | 6.6%
hyundai | 1214 | 3.8%
mahindra | 709 | 2.2%
tata | 633 | 2.0%
swift | 620 | 2.0%
diesel | 545 | 1.7%
bsiv | 542 | 1.7%
1.2 | 502 | 1.6%
vxi | 476 | 1.5%
plus | 475 | 1.5%
Other values (825) | 23905 | 75.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

year
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 27
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.611136
Minimum: 1994
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 52.6 KiB

Quantile statistics

Minimum: 1994
5-th percentile: 2007
Q1: 2011
median: 2014
Q3: 2017
95-th percentile: 2019
Maximum: 2020
Range: 26
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.897401569
Coefficient of variation (CV): 0.001935528414
Kurtosis: 1.166627829
Mean: 2013.611136
Median Absolute Deviation (MAD): 3
Skewness: -0.931471173
Sum: 13525426
Variance: 15.18973899
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
Value | Count | Frequency (%)
2017 | 802 | 11.9%
2016 | 691 | 10.3%
2015 | 680 | 10.1%
2018 | 607 | 9.0%
2014 | 580 | 8.6%
2012 | 576 | 8.6%
2013 | 560 | 8.3%
2011 | 535 | 8.0%
2010 | 361 | 5.4%
2019 | 347 | 5.2%
Other values (17) | 978 | 14.6%

Value | Count | Frequency (%)
1994 | 2 | < 0.1%
1995 | 1 | < 0.1%
1996 | 2 | < 0.1%
1997 | 9 | 0.1%
1998 | 9 | 0.1%
1999 | 13 | 0.2%
2000 | 14 | 0.2%
2001 | 6 | 0.1%
2002 | 19 | 0.3%
2003 | 36 | 0.5%

Value | Count | Frequency (%)
2020 | 63 | 0.9%
2019 | 347 | 5.2%
2018 | 607 | 9.0%
2017 | 802 | 11.9%
2016 | 691 | 10.3%
2015 | 680 | 10.1%
2014 | 580 | 8.6%
2013 | 560 | 8.3%
2012 | 576 | 8.6%
2011 | 535 | 8.0%

selling_price
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 670
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 526385.997
Minimum: 29999
Maximum: 10000000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 52.6 KiB

Quantile statistics

Minimum: 29999
5-th percentile: 110000
Q1: 250000
median: 420000
Q3: 650000
95-th percentile: 1200000
Maximum: 10000000
Range: 9970001
Interquartile range (IQR): 400000

Descriptive statistics

Standard deviation: 523550.4483
Coefficient of variation (CV): 0.994613176
Kurtosis: 52.48996792
Mean: 526385.997
Median Absolute Deviation (MAD): 185000
Skewness: 5.57076348
Sum: 3535734742
Variance: 2.741050719 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
300000 | 208 | 3.1%
350000 | 196 | 2.9%
600000 | 167 | 2.5%
400000 | 164 | 2.4%
250000 | 161 | 2.4%
550000 | 160 | 2.4%
500000 | 160 | 2.4%
450000 | 147 | 2.2%
650000 | 145 | 2.2%
200000 | 134 | 2.0%
Other values (660) | 5075 | 75.6%

Value | Count | Frequency (%)
29999 | 1 | < 0.1%
30000 | 1 | < 0.1%
31000 | 1 | < 0.1%
31504 | 1 | < 0.1%
33351 | 1 | < 0.1%
35000 | 3 | < 0.1%
39000 | 1 | < 0.1%
40000 | 11 | 0.2%
42000 | 2 | < 0.1%
45000 | 21 | 0.3%

Value | Count | Frequency (%)
10000000 | 1 | < 0.1%
7200000 | 1 | < 0.1%
6523000 | 1 | < 0.1%
6223000 | 1 | < 0.1%
6000000 | 3 | < 0.1%
5923000 | 1 | < 0.1%
5850000 | 1 | < 0.1%
5830000 | 1 | < 0.1%
5800000 | 2 | < 0.1%
5500000 | 4 | 0.1%

km_driven
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 898
Distinct (%): 13.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 73398.33765
Minimum: 1
Maximum: 2360457
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 52.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 11500
Q1: 38000
median: 68203
Q3: 100000
95-th percentile: 155000
Maximum: 2360457
Range: 2360456
Interquartile range (IQR): 62000

Descriptive statistics

Standard deviation: 58703.27527
Coefficient of variation (CV): 0.7997902561
Kurtosis: 397.3335813
Mean: 73398.33765
Median Absolute Deviation (MAD): 31797
Skewness: 11.91618609
Sum: 493016634
Variance: 3446074527
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
120000 | 487 | 7.3%
70000 | 420 | 6.3%
80000 | 405 | 6.0%
60000 | 383 | 5.7%
50000 | 346 | 5.2%
100000 | 305 | 4.5%
90000 | 301 | 4.5%
40000 | 280 | 4.2%
110000 | 256 | 3.8%
30000 | 213 | 3.2%
Other values (888) | 3321 | 49.4%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
1000 | 5 | 0.1%
1300 | 1 | < 0.1%
1303 | 1 | < 0.1%
1500 | 2 | < 0.1%
1600 | 1 | < 0.1%
1620 | 1 | < 0.1%
2000 | 7 | 0.1%
2118 | 1 | < 0.1%
2136 | 1 | < 0.1%

Value | Count | Frequency (%)
2360457 | 1 | < 0.1%
1500000 | 1 | < 0.1%
577414 | 1 | < 0.1%
500000 | 2 | < 0.1%
475000 | 1 | < 0.1%
440000 | 1 | < 0.1%
426000 | 1 | < 0.1%
380000 | 1 | < 0.1%
376412 | 1 | < 0.1%
375000 | 1 | < 0.1%

fuel
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.6 KiB

Value | Count
Diesel | 3658
Petrol | 2973
CNG | 51
LPG | 35

Length

Max length: 6
Median length: 6
Mean length: 5.961589996
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Diesel
2nd row: Diesel
3rd row: Petrol
4th row: Diesel
5th row: Petrol

Common Values

Value | Count | Frequency (%)
Diesel | 3658 | 54.5%
Petrol | 2973 | 44.3%
CNG | 51 | 0.8%
LPG | 35 | 0.5%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
diesel | 3658 | 54.5%
petrol | 2973 | 44.3%
cng | 51 | 0.8%
lpg | 35 | 0.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

seller_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.6 KiB

Value | Count
Individual | 6024
Dealer | 666
Trustmark Dealer | 27

Length

Max length: 16
Median length: 10
Mean length: 9.627512282
Min length: 6

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Individual
2nd row: Individual
3rd row: Individual
4th row: Individual
5th row: Individual

Common Values

Value | Count | Frequency (%)
Individual | 6024 | 89.7%
Dealer | 666 | 9.9%
Trustmark Dealer | 27 | 0.4%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
individual | 6024 | 89.3%
dealer | 693 | 10.3%
trustmark | 27 | 0.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

transmission
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.6 KiB

Value | Count
Manual | 6142
Automatic | 575

Length

Max length: 9
Median length: 6
Mean length: 6.256811076
Min length: 6

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Manual
2nd row: Manual
3rd row: Manual
4th row: Manual
5th row: Manual

Common Values

Value | Count | Frequency (%)
Manual | 6142 | 91.4%
Automatic | 575 | 8.6%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
manual | 6142 | 91.4%
automatic | 575 | 8.6%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

owner
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.6 KiB

Value | Count
First Owner | 4176
Second Owner | 1888
Third Owner | 493
Fourth & Above Owner | 155
Test Drive Car | 5

Length

Max length: 20
Median length: 11
Mean length: 11.490993
Min length: 11

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: First Owner
2nd row: Second Owner
3rd row: Third Owner
4th row: First Owner
5th row: First Owner

Common Values

Value | Count | Frequency (%)
First Owner | 4176 | 62.2%
Second Owner | 1888 | 28.1%
Third Owner | 493 | 7.3%
Fourth & Above Owner | 155 | 2.3%
Test Drive Car | 5 | 0.1%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
owner | 6712 | 48.8%
first | 4176 | 30.4%
second | 1888 | 13.7%
third | 493 | 3.6%
fourth | 155 | 1.1%
(empty) | 155 | 1.1%
above | 155 | 1.1%
test | 5 | < 0.1%
drive | 5 | < 0.1%
car | 5 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

seats
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.434271252
Minimum: 2
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 52.6 KiB

Quantile statistics

Minimum: 2
5-th percentile: 5
Q1: 5
median: 5
Q3: 5
95-th percentile: 7
Maximum: 14
Range: 12
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.9838050408
Coefficient of variation (CV): 0.1810371612
Kurtosis: 3.608223114
Mean: 5.434271252
Median Absolute Deviation (MAD): 0
Skewness: 1.919315131
Sum: 36502
Variance: 0.9678723582
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value | Count | Frequency (%)
5 | 5254 | 78.2%
7 | 966 | 14.4%
8 | 221 | 3.3%
4 | 124 | 1.8%
9 | 74 | 1.1%
6 | 57 | 0.8%
10 | 18 | 0.3%
2 | 2 | < 0.1%
14 | 1 | < 0.1%

Value | Count | Frequency (%)
2 | 2 | < 0.1%
4 | 124 | 1.8%
5 | 5254 | 78.2%
6 | 57 | 0.8%
7 | 966 | 14.4%
8 | 221 | 3.3%
9 | 74 | 1.1%
10 | 18 | 0.3%
14 | 1 | < 0.1%

Value | Count | Frequency (%)
14 | 1 | < 0.1%
10 | 18 | 0.3%
9 | 74 | 1.1%
8 | 221 | 3.3%
7 | 966 | 14.4%
6 | 57 | 0.8%
5 | 5254 | 78.2%
4 | 124 | 1.8%
2 | 2 | < 0.1%

mileage KMPL
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 376
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.49667832
Minimum: 9
Maximum: 30.46
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 52.6 KiB

Quantile statistics

Minimum: 9
5-th percentile: 12.9
Q1: 16.8
median: 19.46658478
Q3: 22.5
95-th percentile: 25.83
Maximum: 30.46
Range: 21.46
Interquartile range (IQR): 5.7

Descriptive statistics

Standard deviation: 3.915238225
Coefficient of variation (CV): 0.2008156549
Kurtosis: -0.510254775
Mean: 19.49667832
Median Absolute Deviation (MAD): 2.853415215
Skewness: 0.007906554036
Sum: 130959.1883
Variance: 15.32909036
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
18.9 | 210 | 3.1%
19.7 | 168 | 2.5%
18.6 | 150 | 2.2%
21.1 | 147 | 2.2%
17 | 124 | 1.8%
15.96 | 108 | 1.6%
16.1 | 106 | 1.6%
17.8 | 96 | 1.4%
12.8 | 88 | 1.3%
15.1 | 86 | 1.3%
Other values (366) | 5434 | 80.9%

Value | Count | Frequency (%)
9 | 4 | 0.1%
9.5 | 1 | < 0.1%
10 | 2 | < 0.1%
10.1 | 2 | < 0.1%
10.5 | 17 | 0.3%
10.71 | 1 | < 0.1%
10.75 | 2 | < 0.1%
10.8 | 1 | < 0.1%
10.9 | 3 | < 0.1%
10.91 | 4 | 0.1%

Value | Count | Frequency (%)
30.46 | 2 | < 0.1%
28.4 | 85 | 1.3%
28.09 | 31 | 0.5%
27.62 | 6 | 0.1%
27.4 | 4 | 0.1%
27.39 | 24 | 0.4%
27.3 | 10 | 0.1%
27.28 | 13 | 0.2%
26.83 | 2 | < 0.1%
26.8 | 3 | < 0.1%

engine CC
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 121
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1433.989377
Minimum: 793
Maximum: 3604
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 52.6 KiB

Quantile statistics

Minimum: 793
5-th percentile: 796
Q1: 1197
median: 1248
Q3: 1498
95-th percentile: 2499
Maximum: 3604
Range: 2811
Interquartile range (IQR): 301

Descriptive statistics

Standard deviation: 490.9976244
Coefficient of variation (CV): 0.342399764
Kurtosis: 0.9926247184
Mean: 1433.989377
Median Absolute Deviation (MAD): 245
Skewness: 1.232357125
Sum: 9632106.646
Variance: 241078.6671
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1248 | 907 | 13.5%
1197 | 698 | 10.4%
796 | 420 | 6.3%
998 | 398 | 5.9%
2179 | 339 | 5.0%
1498 | 336 | 5.0%
1396 | 281 | 4.2%
1199 | 192 | 2.9%
2523 | 183 | 2.7%
1461 | 169 | 2.5%
Other values (111) | 2794 | 41.6%

Value | Count | Frequency (%)
793 | 6 | 0.1%
796 | 420 | 6.3%
799 | 71 | 1.1%
814 | 111 | 1.7%
909 | 2 | < 0.1%
936 | 34 | 0.5%
993 | 26 | 0.4%
995 | 42 | 0.6%
998 | 398 | 5.9%
999 | 69 | 1.0%

Value | Count | Frequency (%)
3604 | 1 | < 0.1%
3498 | 1 | < 0.1%
3198 | 4 | 0.1%
2999 | 2 | < 0.1%
2997 | 2 | < 0.1%
2993 | 13 | 0.2%
2987 | 8 | 0.1%
2982 | 27 | 0.4%
2967 | 8 | 0.1%
2956 | 19 | 0.3%

power BHP
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 318
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 87.76610019
Minimum: 32.8
Maximum: 400
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 52.6 KiB

Quantile statistics

Minimum: 32.8
5-th percentile: 47.3
Q1: 67.1
median: 81.83
Q3: 100
95-th percentile: 147.9
Maximum: 400
Range: 367.2
Interquartile range (IQR): 32.9

Descriptive statistics

Standard deviation: 31.72455521
Coefficient of variation (CV): 0.3614670714
Kurtosis: 5.426867236
Mean: 87.76610019
Median Absolute Deviation (MAD): 14.79
Skewness: 1.711327357
Sum: 589524.895
Variance: 1006.447403
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
74 | 324 | 4.8%
88.5 | 193 | 2.9%
46.3 | 158 | 2.4%
67 | 152 | 2.3%
67.1 | 141 | 2.1%
81.8 | 136 | 2.0%
67.04 | 136 | 2.0%
70 | 134 | 2.0%
47.3 | 131 | 2.0%
62.1 | 130 | 1.9%
Other values (308) | 5082 | 75.7%

Value | Count | Frequency (%)
32.8 | 2 | < 0.1%
34.2 | 20 | 0.3%
35 | 19 | 0.3%
35.5 | 2 | < 0.1%
37 | 88 | 1.3%
37.48 | 11 | 0.2%
37.5 | 6 | 0.1%
38 | 2 | < 0.1%
38.4 | 2 | < 0.1%
40.3 | 2 | < 0.1%

Value | Count | Frequency (%)
400 | 1 | < 0.1%
282 | 1 | < 0.1%
280 | 1 | < 0.1%
272 | 1 | < 0.1%
270.9 | 3 | < 0.1%
265 | 1 | < 0.1%
261.4 | 4 | 0.1%
258 | 2 | < 0.1%
254.8 | 3 | < 0.1%
254.79 | 1 | < 0.1%

Interactions


Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
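
The calculation just described can be sketched in plain Python as follows. This is a minimal illustration, not the report's implementation: the helper names and the toy data are hypothetical, and ties are assumed to receive the average of their rank positions.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical toy data: monotonic but nonlinear (y = x cubed).
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman(x, y), pearson(x, y))
```

On this toy data ρ is 1 (up to floating-point rounding) while Pearson's r is lower, which illustrates why ρ is better at catching nonlinear monotonic relationships.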

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
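
A minimal sketch of this calculation, together with a check of the location and scale invariance mentioned above, might look like this. The function name and the toy data are hypothetical, and the rescaling uses positive scale factors.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Hypothetical toy data, not taken from the dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)
# Shifting and rescaling each variable separately (positive scale) leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_rescaled)
```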

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
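
The pair-counting definition can be sketched directly. Note the assumption: this is the simple variant often called tau-a, which makes no adjustment for ties; statistical libraries commonly use a tie-adjusted variant, so results may differ when the data contain ties. The function name and toy data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with no tie adjustment."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical toy data without ties: 7 concordant and 3 discordant pairs.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
print(kendall_tau_a(x, y))  # 0.4
```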

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
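
A minimal sketch of a bias-corrected Cramér's V, following the correction published by Bergsma (2013), is shown here. It is an illustration under stated assumptions rather than the report's own code: the function name is hypothetical, and the toy labels are artificially paired and are not rows from the dataset.

```python
import math
from collections import Counter

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length sequences of labels."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for row, row_count in row_totals.items():
        for col, col_count in col_totals.items():
            expected = row_count * col_count / n
            chi2 += (observed.get((row, col), 0) - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(row_totals), len(col_totals)
    # Bias correction: shrink phi2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Toy data (artificial pairing): perfectly associated labels.
v_perfect = cramers_v_corrected(["Diesel", "Petrol"] * 50, ["Manual", "Automatic"] * 50)
# Toy data: independent labels, every combination equally frequent.
v_independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(v_perfect, v_independent)
```

The two toy cases hit the ends of the range: perfect association gives a value of 1 (up to rounding) and independence gives 0.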

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(row) | df_index | name | year | selling_price | km_driven | fuel | seller_type | transmission | owner | seats | mileage KMPL | engine CC | power BHP
0 | 0 | Maruti Swift Dzire VDI | 2014 | 450000 | 145500 | Diesel | Individual | Manual | First Owner | 5.00 | 23.40 | 1248.00 | 74.00
1 | 1 | Skoda Rapid 1.5 TDI Ambition | 2014 | 370000 | 120000 | Diesel | Individual | Manual | Second Owner | 5.00 | 21.14 | 1498.00 | 103.52
2 | 2 | Honda City 2017-2020 EXi | 2006 | 158000 | 140000 | Petrol | Individual | Manual | Third Owner | 5.00 | 17.70 | 1497.00 | 78.00
3 | 3 | Hyundai i20 Sportz Diesel | 2010 | 225000 | 127000 | Diesel | Individual | Manual | First Owner | 5.00 | 23.00 | 1396.00 | 90.00
4 | 4 | Maruti Swift VXI BSIII | 2007 | 130000 | 120000 | Petrol | Individual | Manual | First Owner | 5.00 | 16.10 | 1298.00 | 88.20
5 | 5 | Hyundai Xcent 1.2 VTVT E Plus | 2017 | 440000 | 45000 | Petrol | Individual | Manual | First Owner | 5.00 | 20.14 | 1197.00 | 81.86
6 | 6 | Maruti Wagon R LXI DUO BSIII | 2007 | 96000 | 175000 | LPG | Individual | Manual | First Owner | 5.00 | 17.30 | 1061.00 | 57.50
7 | 7 | Maruti 800 DX BSII | 2001 | 45000 | 5000 | Petrol | Individual | Manual | Second Owner | 4.00 | 16.10 | 796.00 | 37.00
8 | 8 | Toyota Etios VXD | 2011 | 350000 | 90000 | Diesel | Individual | Manual | First Owner | 5.00 | 23.59 | 1364.00 | 67.10
9 | 9 | Ford Figo Diesel Celebration Edition | 2013 | 200000 | 169000 | Diesel | Individual | Manual | First Owner | 5.00 | 20.00 | 1399.00 | 68.10

Last rows

(row) | df_index | name | year | selling_price | km_driven | fuel | seller_type | transmission | owner | seats | mileage KMPL | engine CC | power BHP
6707 | 8115 | Maruti 800 AC | 1997 | 40000 | 120000 | Petrol | Individual | Manual | First Owner | 4.00 | 16.10 | 796.00 | 37.00
6708 | 8116 | Maruti Alto K10 VXI Airbag | 2017 | 340000 | 45000 | Petrol | Individual | Manual | First Owner | 5.00 | 23.95 | 998.00 | 67.10
6709 | 8118 | Hyundai i20 Magna | 2013 | 380000 | 25000 | Petrol | Individual | Manual | First Owner | 5.00 | 18.50 | 1197.00 | 82.85
6710 | 8119 | Maruti Wagon R LXI Optional | 2017 | 360000 | 80000 | Petrol | Individual | Manual | First Owner | 5.00 | 20.51 | 998.00 | 67.04
6711 | 8120 | Hyundai Santro Xing GLS | 2008 | 120000 | 191000 | Petrol | Individual | Manual | First Owner | 5.00 | 17.92 | 1086.00 | 62.10
6712 | 8121 | Maruti Wagon R VXI BS IV with ABS | 2013 | 260000 | 50000 | Petrol | Individual | Manual | Second Owner | 5.00 | 18.90 | 998.00 | 67.10
6713 | 8122 | Hyundai i20 Magna 1.4 CRDi | 2014 | 475000 | 80000 | Diesel | Individual | Manual | Second Owner | 5.00 | 22.54 | 1396.00 | 88.73
6714 | 8123 | Hyundai i20 Magna | 2013 | 320000 | 110000 | Petrol | Individual | Manual | First Owner | 5.00 | 18.50 | 1197.00 | 82.85
6715 | 8124 | Hyundai Verna CRDi SX | 2007 | 135000 | 119000 | Diesel | Individual | Manual | Fourth & Above Owner | 5.00 | 16.80 | 1493.00 | 110.00
6716 | 8125 | Maruti Swift Dzire ZDi | 2009 | 382000 | 120000 | Diesel | Individual | Manual | First Owner | 5.00 | 19.30 | 1248.00 | 73.90